Skip to content

[CDRIVER-4153] Beginning of Header Hygiene & Verification #2053

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 15 commits into
base: master
Choose a base branch
from

Conversation

vector-of-bool
Copy link
Contributor

Refer: CDRIVER-4153

Summary

This PR is an exploration of how we can incrementally begin to clean up our header hygiene, put together on a whim, when I realized the following:

  1. We no longer allow users to include unqualified header names (e.g. No more <bson.h>, they must now do <bson/bson.h>)
  2. With the additional directory prefix, all of the bson- and mongoc- prefixes are redundant and somewhat annoying, and I'd like to get rid of those prefixes
  3. CMake 3.24 supports header verification, which will generate a compilation rule in an ephemeral target that only includes a single header.
  4. We can selectively enable header verification on individual files based on a name pattern.
  5. Because all bson- and mongoc- headers have guards against being included directly (via the -prelude.h headers), those files are not part of the public API, as users cannot access them directly. This gives us a filename pattern that we can use to deselect files for header verification.

Thus, this changeset. It works like this:

  • For any header file in bson/ or mongoc/ that does not begin with bson- or mongoc-, we want to perform header verification on those files.
  • To add a file to the verification check, rename it from e.g. bson/bson-foo.h to bson/foo.h. You can now run the build and header verification targets to check that the header can stand alone. This also gives the header a nicer name without a redundant bson- prefix, for public consumption as bson/foo.h
  • We don't use header verification directly on the bson_static and mongoc_static targets, as it (1) requires a newer CMake than we require of users and (2) only checks as C or as C++, but not both. Instead, two new placeholder targets are created that have the headers attached: One checks the headers as C, and the other as C++. (See: build/VerifyHeaders.cmake)

Note

This method can also supplant the need for the separate mongoc-cxx-check and public-header-warnings targets, since we can now roll those into the header verification directly (this PR doesn't do that).

Changes

The following is changed:

  1. Add a VerifyHeaders.cmake module with a mongo_verify_headers command. This creates two blank targets whose only purpose is the creation of the verification rules.
  2. Add selective header verification for bson and mongoc headers.
  3. As a basic proof-of-concept, the following headers were made "public", which involved moving the relevant components into those standalone headers:
    • bson/config.h is the new filename of the generate libbson config header.
    • bson/macros.h contains only compile-time preprocessor macros.
    • bson/compat.h contains OS and C library compatibility shims
    • bson/bson_t.h contains only the definition of the bson_t struct
    • bson/error.h contains bson_error_t and its associated funtions
    • bson/memory.h contains bson_malloc, bson_free, etc.
    • (The bson/bson.h header still transitively includes all of the above headers, so users can continue to use the "get everything" header)
  4. Fixed some minor C and C++ incompatibilities that were uncovered by enabling header verification.
  5. Modify the check-prelude.py to only consider files that have the bson- or mongoc- prefix, to mirror the behavior of our selective header verification.
  6. Add an Earthly target and Evergreen verify-headers task1 that runs header verification (against a few platforms, since differences in C stdlib headers can cause issues).

Footnotes

  1. Unfortunately, this may be confusing with the check-headers task, but I'd rather migrate to the name verify-headers since it directly corresponds to the name of the CMake functionality, and running them in an Earthly target runs faster in CI.

This change starts a new pattern for public API
headers: Headers that omit the "bson-" or "mongoc-"
prefix are intended for public `#include` directives, allowing users
to selectively import the APIs that they want.

This change required minimal modification of the existing code, as the
few headers that have been modified are very basic and have very few
dependencies.
This adds a CMake utility based around
VERIFY_INTERFACE_HEADER_SETS to check that certain headers can be
included directly in a translation unit without any prerequesits, in
either C or C++ code.

This verification only concerns itself with headers that we have already
made part of the public API. Private header files and header files
that are not-yet-sanitized are not checked by this process.

The verification does not yet happen automatically. It is only enabled
for CMake 3.24 or newer, and requires the dev/CI to build the special
linting target the CMake generates.
@vector-of-bool vector-of-bool marked this pull request as ready for review July 3, 2025 19:35
@vector-of-bool vector-of-bool requested a review from a team as a code owner July 3, 2025 19:35
@vector-of-bool vector-of-bool changed the title [CDRIVER-4153] PoC: Header Hygiene & Verification [CDRIVER-4153] Beginning of Header Hygiene & Verification Jul 3, 2025
Copy link
Contributor

@rcsanchez97 rcsanchez97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

I suggested some minor spelling fixes.

Copy link
Contributor

@eramongodb eramongodb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quite exciting changes.

Rather than defining <name>-verify-headers-c (and <name>-verify-headers-cpp) object library targets, can we consider conditionally (on CMake 3.24 and newer) and selectively (opt-in) adding the verified headers to the headers file set for the bson_static and mongoc_static library targets instead?

if (CMAKE_VERSION VERSION_GREATER_EQUAL "3.24")
  # When `CMAKE_VERIFY_INTERFACE_HEADER_SETS=ON`, creates the
  # `bson_static_verify_interface_header_sets` target and the
  # `all_verify_interface_header_sets` target.
  target_sources (bson_static PUBLIC FILE_SET HEADERS ...)
endif ()

In addition to slightly improving how we handle headers in the CMake configuration (e.g. for bson, currently bundled together with .c files in the glob-recursed all_sources variable), since file sets are completely opt-in, and since the VERIFY_INTERFACE_HEADER_SETS property is false by default (due to CMAKE_VERIFY_INTERFACE_HEADER_SETS being unset by default), I think pattern will better reflect the "for development purposes" nature and intended use of the VERIFY_INTERFACE_HEADER_SETS property, which we can enable when building test targets (i.e. in CI) using the CMAKE_VERIFY_INTERFACE_HEADER_SETS variable during CMake configuration. Quoting CMake docs:

Projects should not normally set this variable, it is intended as a developer control to be set on the cmake command line or other equivalent methods. The developer must have the ability to enable or disable header set verification according to the capabilities of their own machine and compiler.

If we want better control over the set of headers to be verified (that is, if we want to add more headers to the HEADERS file set than just those which we want to be verified; I think we want to eventually move them into header file sets anyways), we can use custom header set names together with the INTERFACE_HEADER_SETS_TO_VERIFY target property.

#include <bson/bson-types.h>
#include <bson/macros.h>
// For other APIs, not properly grouped, but needed:
#include <bson/bson.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this include of <bson/bson.h> + <bson/bson-prelude.h> above need to be updated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand the question.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT this is the only non-bson--prefixed header which still includes <bson/bson-prelude.h> and <bson/bson.h>. All others (which are now defined as standalone-includable "public" headers) removed both includes in favor of a more streamlined list. I think this header should perhaps be renamed to <bson/bson-bcon.h> for consistency with the new header naming scheme.

@vector-of-bool
Copy link
Contributor Author

@eramongodb I initially tried using the verify-header-sets feature on bson_static directly, but there are two issues:

  1. It will only verify as C or as C++, not both (it depends on whether the target contains any C++ sources).
  2. If you add an interface header set to a target, then CMake will generate an error when you install() that target but don't specify an installation destination for the interface headers. This would require that we add and remove conditional install rules for our headers. I'd prefer to use header sets for installation, but that will require a newer CMake than we currently require for the project. CMake is considering support for verification of private header sets, but that's not currently available.

Given that we can't attach interface header sets to the target, we'll need to generate an extra target anyway to carry this information, and therefore I figured I should also add the ability to check both C and C++ separately.

@vector-of-bool vector-of-bool requested a review from eramongodb July 9, 2025 22:13
Copy link
Contributor

@eramongodb eramongodb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the explanation. SGTM.

One last concern regarding the <bson/bcon.h> header; otherwise, LGTM.

#include <bson/bson-types.h>
#include <bson/macros.h>
// For other APIs, not properly grouped, but needed:
#include <bson/bson.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT this is the only non-bson--prefixed header which still includes <bson/bson-prelude.h> and <bson/bson.h>. All others (which are now defined as standalone-includable "public" headers) removed both includes in favor of a more streamlined list. I think this header should perhaps be renamed to <bson/bson-bcon.h> for consistency with the new header naming scheme.

Copy link
Collaborator

@kevinAlbs kevinAlbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with minor comments.

@@ -116,13 +116,13 @@ else ()
endif ()

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can update the above comment "bson-config.h as preprocessor values" to "bson/config.h as preprocessor values".

#include <bson/bson-version-functions.h>

#include <bson/bson-version.h>

#include <bson/version.h>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bson-version.h is also referenced in check-files.py. I expect check-files.py can be removed (appears unused, and I think is no longer needed now that tarballs are not generated in CMake).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants